Effect of different noise reduction techniques and template matching parameters on markerless tumor tracking using dual‐energy imaging

Abstract Purpose To evaluate the impact of various noise reduction algorithms and template matching parameters on the accuracy of markerless tumor tracking (MTT) using dual‐energy (DE) imaging. Methods A Varian TrueBeam linear accelerator was used to acquire a series of alternating 60 and 120 kVp images (over a 180° arc) using fast kV switching, on five early‐stage lung cancer patients. Subsequently, DE logarithmic weighted subtraction was performed offline on sequential images to remove bone. Various noise reduction techniques (simple smoothing, anticorrelated noise reduction [ACNR], noise clipping [NC], and NC‐ACNR) were applied to the resultant DE images. Separately, tumor templates were generated from the individual planning CT scans, and band‐pass parameter settings for template matching were varied. Template tracking was performed for each combination of noise reduction techniques and templates (based on band‐pass filter settings). The tracking success rate (TSR), root mean square error (RMSE), and missing frames (percent unable to track) were evaluated against the estimated ground truth, which was obtained using Bayesian inference. Results DE‐ACNR, combined with template band‐pass filter settings of σ low = 0.4 mm and σ high = 1.6 mm, resulted in the highest TSR (87.5%) and lowest RMSE (1.40 mm), with a reasonable percentage of missing frames (3.1%). By comparison, for unprocessed DE images with optimized band‐pass filter settings of σ low = 0.6 mm and σ high = 1.2 mm, the TSR, RMSE, and missing frames were 85.3%, 1.62 mm, and 2.7%, respectively. Optimized band‐pass filter settings resulted in improved TSR values and a lower missing frame rate for both unprocessed DE and DE‐ACNR compared with the previously published band‐pass parameters based on single‐energy kV images. Conclusion Noise reduction strategies combined with the optimal selection of band‐pass filter parameters can improve the accuracy and TSR of MTT for lung tumors when using DE imaging.


INTRODUCTION
Markerless tumor tracking (MTT) is a technique being considered for the management of lung tumor motion, particularly for stereotactic body radiation therapy (SBRT). [1][2][3][4][5][6][7] Several studies have evaluated MTT with MV 1,2 or kV [3][4][5][6][7] images acquired during treatment. MTT using kV imaging relies on the tumor being consistently visible on the image projections. 5 However, visualizing the tumor is challenging when it is obstructed by a high-density object, such as bone. 5 For these cases, various techniques have been investigated to increase the visibility of lung tumors. 4,[6][7][8][9][10] One of these approaches is dual-energy (DE) subtraction imaging, which improves the contrast of lung tumors by removing bone and has been shown to increase the accuracy of MTT. [8][9][10][11][12][13] However, this improved tumor contrast comes at the cost of increased image noise, 13 which may decrease the accuracy of MTT. To regain the benefits of bone removal, several noise reduction techniques have been investigated for DE imaging, primarily in the diagnostic setting. [14][15][16][17][18] However, to our knowledge, the impact of DE noise reduction techniques has not been thoroughly studied in the context of MTT.
Although several tracking approaches are being considered for MTT, we and others have been investigating the use of template tracking algorithms. 6,[9][10][11] Template tracking uses 2-D templates of the tumor generated from the planning CT scan at the corresponding imaging angle. The template is scanned within a defined search window and the normalized cross-correlation (NCC) is calculated between template and image. The tracked location is the one that maximizes the NCC. To improve tracking performance, a band-pass filter is often applied to the template and image. 11 The optimal selection of band-pass parameters is important in maximizing the success of template tracking. 19,20 For example, if the σ values associated with the band-pass filter are too low, too much detail is preserved, and very few matched locations are found. 20 However, if the σ values are too high, there is little detail remaining in the images and templates, resulting in a low probability of a match. 20 A recent study optimized the band-pass filter parameters on single energy (SE) kV images for template tracking. 20 They determined that the combination of σ low = 0.6 mm and σ high = 2.4 mm worked best for most patients to improve the tracking accuracy on SE images.
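As an illustration of the matching step described above, the following is a minimal Python sketch of exhaustive NCC template matching. It is not the tracking software used in this study; the function names `ncc` and `track` are hypothetical, images are plain nested lists, and the whole input image is treated as the search window.

```python
import math

def ncc(patch, template):
    """Normalized cross-correlation between two equally sized 2-D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    # A flat patch or template has no contrast; report zero correlation.
    return num / den if den > 0 else 0.0

def track(image, template):
    """Scan the template over the image (assumed to be the search window)
    and return (row, col, score) of the position that maximizes the NCC."""
    t_rows, t_cols = len(template), len(template[0])
    best = (0, 0, -2.0)  # NCC lies in [-1, 1], so -2 is below any score
    for r in range(len(image) - t_rows + 1):
        for c in range(len(image[0]) - t_cols + 1):
            patch = [row[c:c + t_cols] for row in image[r:r + t_rows]]
            score = ncc(patch, template)
            if score > best[2]:
                best = (r, c, score)
    return best
```

In practice, both the image and the template would be band-pass filtered before this step, and the search would be restricted to a window around the expected tumor position.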
Due to the inherent differences between DE and SE images, as well as the additional noise processing techniques that may be applied to DE images, it is not clear that the band-pass parameters determined by Hazelaar et al. 20 will be optimal for MTT using DE images. Thus, the goal of this study is to evaluate various DE noise reduction techniques, along with template tracking parameters, to determine the optimal combination that maximizes the success of MTT with DE imaging.

Patient data and dual-energy subtraction
Five early-stage lung cancer patients (Table 1) treated with SBRT were imaged as part of an IRB-approved study. Prior to treatment, all patients underwent a 4D-CT simulation and treatment planning according to our institutional protocol. 21 Patients were simulated in the supine position and immobilized using an alpha cradle indexed to the treatment table. Using the 4D-CT, an internal target volume was generated by contouring (Eclipse, Varian Medical Systems, Palo Alto, CA) the gross tumor volume on the free-breathing, 0%-phase, 50%-phase, and the maximum-intensity-projection series. Treatment planning was performed following our standard methods. 21 All plans used volumetric modulated arc therapy and were optimized to meet our SBRT dose constraints. Typically, 2-3 arcs were used per fraction with an overall prescription dose of 5000-6000 cGy delivered in 3-5 fractions to the planning target volume.
At the completion of one of the treatment fractions, for each patient, a series of alternating projections was obtained at 60 and 120 kVp over a 180° arc using the fast kV switching capabilities of the onboard imager in Developer Mode (TrueBeam, Varian Medical Systems, Palo Alto, CA). 10 Images were acquired at 15 frames/s producing a total of 450 high/low-energy images. The mA setting for the images was adjusted to minimize the difference in air exposure between the 60 kVp (60 mA, 20 ms) and 120 kVp (15 mA, 20 ms) acquisitions. [9][10][11] To create the DE images, weighted logarithmic subtraction (WLS) was performed offline on paired 60/120 kVp images to reduce bone present in the resultant soft-tissue image. [9][10][11] This study required the production of both soft-tissue DE images (I DEST ), as well as bone DE images (I DEB ) for use with subsequent noise reduction algorithms. We performed WLS to generate I DEST and I DEB as follows 22 :
$$I_{DEST} = \ln(I_H) - w_{ST}\,\ln(I_L) \quad (1)$$
$$I_{DEB} = -\ln(I_H) + w_B\,\ln(I_L) \quad (2)$$
where I H and I L are the intensities of individual pixels on the high- and low-energy projections, respectively. The weighting factors, w ST and w B , were used to produce I DEST and I DEB images and were empirically determined to be 0.42 and 0.70, respectively. Various noise reduction techniques were applied to these DE images using software written in MATLAB R2021a (MathWorks, Natick, MA, USA) and are discussed in the following sections.
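The subtraction step can be sketched in a few lines of Python. This is an illustrative sketch rather than the MATLAB code used in the study: the function name `weighted_log_subtraction` is hypothetical, and the sign convention (soft tissue as ln I H − w ST ln I L, bone as −ln I H + w B ln I L) is assumed from the common form of weighted logarithmic subtraction, chosen so that the noise in the two output images is anticorrelated.

```python
import math

W_ST = 0.42  # soft-tissue weighting factor reported in the text
W_B = 0.70   # bone weighting factor reported in the text

def weighted_log_subtraction(high, low, w_st=W_ST, w_b=W_B):
    """Return (soft_tissue, bone) DE images from paired high/low-energy
    projections given as 2-D lists of strictly positive intensities.
    The sign convention is an assumption, not taken from the source."""
    soft, bone = [], []
    for row_h, row_l in zip(high, low):
        soft.append([math.log(h) - w_st * math.log(l)
                     for h, l in zip(row_h, row_l)])
        bone.append([-math.log(h) + w_b * math.log(l)
                     for h, l in zip(row_h, row_l)])
    return soft, bone
```

Because the high-energy term enters the two outputs with opposite signs, a positive noise fluctuation in I H raises the soft-tissue value and lowers the bone value, which is the property that ACNR later exploits.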

Simple smoothing (SS) of the high-energy image
The high-energy image was smoothed using a median filter prior to DE image processing. 14,15 The rationale is that most of the noise in the resultant DE image arises from the high-energy image component. 15 Additionally, the unfiltered low-energy projection (I L ) preserves high spatial-frequency information. 14 In the present study, we used a 3 × 3 median filter on the high-energy image (I H ) as follows:
$$I'_H = \mathrm{median}_{3\times 3}(I_H)$$
$$I_{SSF} = \ln(I'_H) - w_{ST}\,\ln(I_L)$$
where I' H is the simple smoothed high-energy image, and I SSF is the resultant DE image following simple smoothing (SS).
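A minimal Python sketch of the 3 × 3 median filtering step is given below. The function name `median_filter` is hypothetical, and the border handling (using only the part of the window that lies inside the image) is an assumption; other implementations pad the image instead and may differ at the edges.

```python
from statistics import median

def median_filter(image, size=3):
    """Apply a size x size median filter to a 2-D list.
    Border pixels use the portion of the window inside the image
    (an assumed border rule, not taken from the source)."""
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [image[rr][cc]
                      for rr in range(max(0, r - half), min(rows, r + half + 1))
                      for cc in range(max(0, c - half), min(cols, c + half + 1))]
            out_row.append(median(window))
        out.append(out_row)
    return out
```

The smoothed high-energy image would then replace I H in the weighted logarithmic subtraction, while the low-energy image is left unfiltered.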

Anti-correlated noise reduction (ACNR)
The anticorrelated noise reduction (ACNR) algorithm takes advantage of the fact that the noise is anticorrelated between the DE bone and soft-tissue images. 16 Kalender et al. demonstrated that the noise in the complementary image can be reduced by estimating the noise in either the bone or tissue image. To reduce noise in the soft-tissue image, the ACNR algorithm applies a high-pass filter (average filter with kernel size of 20 × 20 pixels) to the complementary image, that is, the bone-only image (I DEB ). This filter effectively removes the bone from the complementary image, leaving the noise behind. The original DE soft-tissue image (I DEST ) and high-pass filtered bone image (I HPF DEB ) are then added, weighted by a parameter, w n 18 :
$$I_{ACNR} = I_{DEST} + w_n\, I^{HPF}_{DEB}$$
The optimal value of w n was determined empirically to be 0.35.
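The ACNR combination can be sketched as follows. This is an illustrative Python sketch, not the study's implementation: the names `box_mean` and `acnr` are hypothetical, and the high-pass filter is assumed to be the bone image minus its local box average, with the averaging window truncated at the image border.

```python
def box_mean(image, size):
    """Local mean over a size x size window, truncated at the border."""
    rows, cols = len(image), len(image[0])
    lo = size // 2
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [image[rr][cc]
                      for rr in range(max(0, r - lo), min(rows, r - lo + size))
                      for cc in range(max(0, c - lo), min(cols, c - lo + size))]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out

def acnr(soft, bone, w_n=0.35, size=20):
    """Add the high-pass filtered bone image to the soft-tissue image.
    High-pass = bone minus its box average (assumed construction)."""
    smooth = box_mean(bone, size)
    return [[s + w_n * (b - m) for s, b, m in zip(row_s, row_b, row_m)]
            for row_s, row_b, row_m in zip(soft, bone, smooth)]
```

When the bone image contains only slowly varying structure, the high-pass term vanishes and the soft-tissue image is returned unchanged; when it carries noise of opposite sign to the soft-tissue noise, the weighted sum partially cancels it.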

Noise clipping (NC)
The noise-clipping (NC) technique, developed by Hinshaw et al., is based on the concept that, due to the increased photoelectric effect, structures in the low-energy image (I L ) should have a higher contrast than those in the high-energy image (I H ). 17,18 A median filter kernel (17 × 17) was used to estimate the background signal in both images. 18 The background was then subtracted from individual pixel values in both images, and the contrast of each paired pixel in the resultant images was compared. If the high-energy pixel contrast was greater than that of the low-energy pixel, the increased contrast was attributed to noise, and its value was clipped.
That is, the high-energy pixel value is adjusted to match the contrast of the low-energy pixel. The algorithm is implemented on a pixel-by-pixel basis: if $I_H - \mathrm{median}(I_H) > Th$ (7), then the high-energy pixel value is clipped. The resultant noise-clipped high-energy image was used to produce a DE image using WLS, defined in Equation (1). Furthermore, we also performed ACNR using the noise-clipped high-energy images as discussed in the previous section. An example of the unprocessed and noise-filtered DE images for a representative patient is shown in Figure 1.
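The clipping rule described above can be illustrated with a short Python sketch. Several details are assumptions rather than the published algorithm: the threshold is taken to be the magnitude of the low-energy contrast at the same pixel, contrasts are compared without any rescaling between the two energies, and the median window is truncated at the image border. The names `background` and `noise_clip` are hypothetical.

```python
import math
from statistics import median

def background(image, size=17):
    """Median-filter estimate of the local background (window truncated
    at the image border, an assumed border rule)."""
    half = size // 2
    rows, cols = len(image), len(image[0])
    return [[median(image[rr][cc]
                    for rr in range(max(0, r - half), min(rows, r + half + 1))
                    for cc in range(max(0, c - half), min(cols, c + half + 1)))
             for c in range(cols)] for r in range(rows)]

def noise_clip(high, low, size=17):
    """Clip high-energy contrast that exceeds the low-energy contrast.
    Assumption: the threshold is |low-energy contrast| at the same pixel."""
    bg_h, bg_l = background(high, size), background(low, size)
    out = []
    for r in range(len(high)):
        out_row = []
        for c in range(len(high[0])):
            c_h = high[r][c] - bg_h[r][c]
            c_l = low[r][c] - bg_l[r][c]
            if abs(c_h) > abs(c_l):
                # Keep the sign of the high-energy contrast, limit its size.
                c_h = math.copysign(abs(c_l), c_h)
            out_row.append(bg_h[r][c] + c_h)
        out.append(out_row)
    return out
```

The clipped high-energy image would then be passed to the weighted logarithmic subtraction, and optionally to ACNR, in place of the original I H.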

Template tracking
To determine the position of the tumor, template matching was performed using nonclinical offline research software (RapidTrack Offline [RTO.3.0.3.0] Varian Medical Systems, Palo Alto, CA, USA). 23 Tracking is a two-step process consisting of template generation, followed by the tracking itself. First, 2D templates of the tumor were derived from the 50% phase of the 4D-planning CT scan for each individual patient. These templates were generated for every 1° of gantry rotation. Templates and images were preprocessed using a band-pass filter that was implemented in software by subtracting a high-pass Gaussian-shaped kernel (σ high ) from a low-pass Gaussian-shaped kernel (σ low ). To optimize the band-pass filter settings for the present study, 20 different combinations of σ low and σ high values were used. σ low was varied from 0.
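The band-pass construction described above (the difference of two Gaussian-smoothed versions of the same data) can be sketched in one dimension as follows. This is not the RapidTrack implementation: the function names, the `pixel_mm` parameter used to convert σ from millimetres to pixels, the 3σ kernel truncation, and the edge replication are all assumptions made for illustration.

```python
import math

def gaussian_kernel(sigma_px):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma_px)))
    weights = [math.exp(-0.5 * (i / sigma_px) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma_px):
    """Convolve a 1-D profile with a Gaussian, replicating edge samples."""
    kernel = gaussian_kernel(sigma_px)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def band_pass(signal, sigma_low_mm, sigma_high_mm, pixel_mm):
    """Difference of Gaussians: smoothing with sigma_low minus smoothing
    with sigma_high. pixel_mm is a hypothetical pixel-spacing parameter."""
    narrow = smooth(signal, sigma_low_mm / pixel_mm)
    wide = smooth(signal, sigma_high_mm / pixel_mm)
    return [a - b for a, b in zip(narrow, wide)]
```

A uniform profile passes through as zero, whereas an edge produces a localized response whose width is governed by the two σ values; the same filter would be applied along both image axes for a 2D template or projection.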

Ground-truth estimation and tracking metrics
In clinical images, the ground-truth (GT) for MTT is not known. Hence, estimated GT positions were determined using Bayesian inference based on the Kalman filter (KF) with constant acceleration. [24][25][26] The KF provides an effective estimator for the position of a dynamical system in the presence of measurement error. As such, it is a statistical inference approach to estimate the true position of an object by combining the predicted and measured positions. Previous studies have used the KF approach to estimate the 3D position of tumors from measured data. 24,25 In the case of prostate cancer, Nguyen et al. demonstrated that the KF can estimate the true prostate position with submillimeter precision and accuracy. 25 In the current study, the KF is used to estimate GT through an iterative two-step process of prediction and correction. 24,25 The variables associated with the KF equations are defined in Table 2.
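For reference, the prediction and correction steps of a linear Kalman filter take the standard textbook form below. The symbols A (state transition matrix), Q (process noise covariance), H (measurement matrix), R (measurement noise covariance), and Δt (frame interval) are conventional notation assumed here and may differ from the notation in Table 2; the constant-acceleration form of A is shown for a single spatial axis with state (position, velocity, acceleration).

```latex
\begin{align}
  x_k^{p} &= A\,x_{k-1}
    && \text{(state prediction)} \\
  P_k^{p} &= A\,P_{k-1}\,A^{\mathsf{T}} + Q
    && \text{(covariance prediction)} \\
  K_k &= P_k^{p} H^{\mathsf{T}} \left( H P_k^{p} H^{\mathsf{T}} + R \right)^{-1}
    && \text{(Kalman gain)} \\
  x_k &= x_k^{p} + K_k \left( z_k - H x_k^{p} \right)
    && \text{(state correction)} \\
  P_k &= P_k^{p} - K_k H P_k^{p}
    && \text{(covariance update)} \\[4pt]
  A &= \begin{pmatrix}
         1 & \Delta t & \tfrac{1}{2}\Delta t^{2} \\
         0 & 1 & \Delta t \\
         0 & 0 & 1
       \end{pmatrix}
    && \text{(constant acceleration, one axis)}
\end{align}
```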
Briefly, the prediction step estimates the tumor position prior to measurement using a state transition model. 24 The predicted position vector x k p in the kth frame is based on the information of tumor motion from the previous frame (k−1). Similarly, the predicted measurement error covariance matrix, P kp , is estimated from the previous error covariance, P k−1 . In the correction step, the current measurement, z k , is combined with the prediction to give the estimated position x k (our estimated GT), and P k is calculated for future prediction and update. The Kalman gain factor, K k , is used to correct the prediction. Additional theoretical foundation can be found in the original paper by Kalman. 26
Using the estimated GT from the KF, a quantitative analysis was performed based on tracking success rate (TSR), root mean square error (RMSE), and the percentage of missing frames (number of frames that were not tracked by the MTT software). The TSR is based on the difference between the tracked location and the estimated GT. A successful tracking event on a particular image frame is defined as a difference between the tracked location and estimated GT positions of <2 mm. RMSE was calculated for each patient as the square root of the mean squared difference between the tracked location and the estimated GT of the target over all tracked frames:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N_{tr}} \sum_{i=1}^{N_{tr}} \left( x_i - x_t \right)^2}$$
where x i and x t are the actual tracked location and estimated GT, respectively, and N tr is the total number of tracked frames per patient.
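The three metrics can be computed as in the following Python sketch. It is illustrative only: the function name `tracking_metrics` is hypothetical, positions are treated as one-dimensional values in mm, untracked frames are represented by `None`, and the TSR is assumed to be computed over tracked frames rather than over all frames.

```python
import math

def tracking_metrics(tracked, ground_truth, tolerance_mm=2.0):
    """Return (TSR %, RMSE mm, missing frames %).
    tracked: per-frame positions in mm, or None where tracking failed.
    Assumption: TSR and RMSE are computed over tracked frames only."""
    pairs = [(t, g) for t, g in zip(tracked, ground_truth) if t is not None]
    n_total, n_tracked = len(tracked), len(pairs)
    if n_total == 0 or n_tracked == 0:
        raise ValueError("at least one tracked frame is required")
    errors = [abs(t - g) for t, g in pairs]
    # A frame counts as a success when the error is below the tolerance.
    tsr = 100.0 * sum(e < tolerance_mm for e in errors) / n_tracked
    rmse = math.sqrt(sum(e * e for e in errors) / n_tracked)
    missing = 100.0 * (n_total - n_tracked) / n_total
    return tsr, rmse, missing
```

For example, with tracked positions of 0, 3, and 1 mm against an estimated GT of 0 mm, plus one untracked frame, two of the three tracked frames fall within the 2 mm tolerance and one of four frames is missing.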

Effect of noise reduction techniques on DE images
In total, the data from five patients resulted in 1140 DE images. These were analyzed both as unprocessed DE images and after each of the four noise reduction techniques, for a total of 5700 images. [...] percent missing frames where the algorithm was unable to track the tumor. From Figure 3, the percentage of missing frames increases nearly linearly as a function of σ low for both DE and DE-ACNR. Although the lowest percentage of missing frames occurs at σ low = 0.2 mm, this occurs at the expense of lower TSR and higher RMSE.

Template matching optimization: band-pass filter settings
Based on our three factors, TSR, RMSE, and the percent missing frames, we seek to determine the best band-pass parameter settings for unprocessed DE and DE-ACNR. Table 3 shows the optimal combination of parameters along with the results using the parameters that were optimized for SE imaging. From this analysis, there is an advantage of using DE-ACNR versus unprocessed DE due to an increase in TSR and a reduction in RMSE. In both cases, the percent of missing frames is comparable. In comparison to the SE optimized values from Hazelaar et al., 20 using DE-optimized band-pass parameters provides an increase in TSR and comparable RMSE values. However, the most significant advantage of using DE-optimized parameters is an ∼50% relative reduction in the number of missing frames where the software is unable to track the tumor. A low missing frame rate is essential for the eventual clinical implementation of such an approach.
Although motion tracking with DE images is currently not clinically available, the use of optimized template parameters and application of ACNR to DE images would fit seamlessly into a future clinical workflow. Templates for motion tracking would be generated from the planning CT scan prior to patient treatment and motion tracking. At this stage, the optimal band-pass filter settings (σ low = 0.4 mm and σ high = 1.6 mm) would be utilized. During treatment and image acquisition, soft-tissue and bone DE images would be produced using logarithmic subtraction with appropriate weighting factors. The high-pass filtered bone DE image would then be combined with the soft-tissue DE image using the optimized noise cancelation weighting factor as described in Section 2.3 to produce DE-ACNR images. All these calculations are matrix based and hence are computationally efficient, allowing them to be performed in near real-time. Motion tracking, using the previously calculated templates, would then be performed on the DE-ACNR images.
To our knowledge, this is the first study to evaluate DE noise reduction techniques and to determine the optimal band-pass parameters for template-based motion tracking. However, there are a number of limitations. The most important of these is that the parameters were optimized on a small series of patients. However, as shown in the TSR and RMSE plots, there are clear maxima and minima, respectively. The inclusion of additional patients is not expected to change the overall results; however, it may impact the absolute values (i.e., TSR and RMSE). Additionally, as this study uses patient data, GT was not directly assessable and had to be estimated using the KF. Although the KF has been shown to produce the best estimate of the underlying signal from noisy data, 24-26 it nonetheless does not provide an absolute measurement of GT as would be obtained from a phantom. However, most phantoms do not have realistic anatomy and tumor geometries, or realistic respiratory and cardiac motion. Hence, we believe the inclusion of these factors outweighs the limitation of using an estimated GT. Future studies will consider the use of deep learning-based tracking algorithms, and how DE image quality may impact the accuracy of tumor tracking using these approaches.

AUTHOR CONTRIBUTIONS
All listed authors contributed to the work and to writing the article.

ACKNOWLEDGMENT
Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under Award Number R01-CA207483. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.